Graph Based Similarity Measures for Synonym Extraction from Parsed Text
Authors
Abstract
We learn graph-based similarity measures for the task of extracting word synonyms from a corpus of parsed text. A constrained graph walk variant that has been applied successfully in similar settings is shown to outperform a state-of-the-art syntactic vector-based approach on this task. Further, we show that learning specialized similarity measures for different word types is advantageous.
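As a rough illustration of the general idea of a graph-walk similarity measure over parsed text, the sketch below runs a plain finite walk with restart on a tiny word graph. It is not the constrained walk variant or the learned measure the abstract refers to. The toy edges, the `dobj` label, the edge weights, and the function names `build_graph` and `walk_scores` are all assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical toy dependency edges: (head, relation, dependent).
# These are illustrative only and do not come from the paper's corpus.
EDGES = [
    ("buy", "dobj", "car"),
    ("buy", "dobj", "automobile"),
    ("drive", "dobj", "car"),
    ("drive", "dobj", "automobile"),
    ("eat", "dobj", "apple"),
    ("buy", "dobj", "apple"),
]

# Hypothetical edge-label weights; a learned measure would tune these.
WEIGHTS = {"dobj": 1.0, "dobj-inv": 1.0}


def build_graph(edges):
    """Map each node to a list of (neighbor, label), adding inverse edges."""
    graph = defaultdict(list)
    for head, rel, dep in edges:
        graph[head].append((dep, rel))
        graph[dep].append((head, rel + "-inv"))
    return graph


def walk_scores(graph, start, steps=4, restart=0.5):
    """Finite walk with restart from `start`; returns node -> probability.

    Assumes every reachable node has at least one outgoing edge, which
    holds here because each edge is paired with its inverse.
    """
    dist = {start: 1.0}
    for _ in range(steps):
        nxt = defaultdict(float)
        nxt[start] += restart  # a fixed share of the mass returns to the query
        for node, mass in dist.items():
            out = graph[node]
            total = sum(WEIGHTS[label] for _, label in out)
            for nbr, label in out:
                nxt[nbr] += (1 - restart) * mass * WEIGHTS[label] / total
        dist = dict(nxt)
    return dist


if __name__ == "__main__":
    scores = walk_scores(build_graph(EDGES), "car")
    for word in ("automobile", "apple"):
        print(word, round(scores[word], 4))
```

In this toy graph, "automobile" shares both of its governing verbs with "car", while "apple" shares only one, so the walk assigns "automobile" the higher score. A synonym extractor would additionally restrict the ranked candidates to words of the same type as the query.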
Similar resources
Learning Graph Walk Based Similarity Measures for Parsed Text
We consider a parsed text corpus as an instance of a labelled directed graph, where nodes represent words and weighted directed edges represent the syntactic relations between them. We show that graph walks, combined with existing techniques of supervised learning, can be used to derive a task-specific word similarity measure in this graph. We also propose a new path-constrained graph walk meth...
Natural Language Engineering
We consider a dependency-parsed text corpus as an instance of a labeled directed graph, where nodes represent words and weighted directed edges represent the syntactic relations between them. We show that graph walks, combined with existing techniques of supervised learning that model local and global information about the graph walk process, can be used to derive a task-specific word similarit...
Adaptive Graph Walk Based Similarity Measures in Entity-Relation Graphs
Relational or semi-structured data is naturally represented by a graph schema, where nodes denote entities and directed typed edges represent the relations between them. Such graphs are heterogeneous in the sense that they describe different types of objects and multiple types of links. For example, email data can be described in a graph that includes messages, persons, dates and other objects;...
Two Approaches for QA4MRE: Information Retrieval and Graph-based Knowledge Representation
In this paper we present our approaches for tackling the QA4MRE 2013 main task. We have built two different methodologies, one based on information retrieval and the other based on graph representations of the text; in addition, we have built a third, hybrid methodology that combines the previous two. The first methodology uses the Lucene information retrieval engine for carrying out inf...
Automatic Discovery of Similar Words
We deal with the issue of automatic discovery of similar words (synonyms and near-synonyms) from different kinds of sources: from large corpora of documents, from the Web, and from monolingual dictionaries. We present in detail three algorithms that extract similar words from a large corpus of documents and consider the specific case of the World Wide Web. We then describe a recent method of aut...